Matching-Vector Families and LDCs Over Large Modulo

Zeev Dvir* Guangda Hu^ 



Abstract 



We prove new upper bounds on the size of families of vectors in $\mathbb{Z}_m^n$ with restricted modular inner products, when $m$ is a large integer. More formally, if $u_1, \ldots, u_t \in \mathbb{Z}_m^n$ and $v_1, \ldots, v_t \in \mathbb{Z}_m^n$ satisfy $\langle u_i, v_i \rangle \equiv 0 \pmod{m}$ and $\langle u_i, v_j \rangle \not\equiv 0 \pmod{m}$ for all $i \neq j \in [t]$, we prove that $t \le O(m^{n/2+8.47})$. This improves a recent bound of $t \le m^{n/2+O(\log m)}$ by [BDL13] and is the best possible up to the constant 8.47 when $m$ is sufficiently larger than $n$.

The maximal size of such families, called 'Matching-Vector families', shows up in recent constructions of locally decodable error correcting codes (LDCs) and determines the rate of the code. Using our result we are able to show that these codes, called Matching-Vector codes, must have encoding length at least $K^{19/18}$ for $K$-bit messages, regardless of their query complexity. This improves a known super-linear bound of $K 2^{\Omega(\sqrt{\log K})}$ proved in [DGY11].



1 Introduction 




A Matching-Vector family (MV family) in $\mathbb{Z}_m^n$ is defined as a pair of ordered lists $U = (u_1, \ldots, u_t)$ and $V = (v_1, \ldots, v_t)$ with $u_i, v_j \in \mathbb{Z}_m^n$, satisfying the following property: for all $i \in [t]$, $\langle u_i, v_i \rangle \equiv 0 \pmod{m}$, whereas for all $i \neq j \in [t]$, $\langle u_i, v_j \rangle \not\equiv 0 \pmod{m}$. Here $\langle \cdot, \cdot \rangle$ denotes the standard inner product. If one restricts the entries of the vectors in the family to be in the set $\{0, 1\}$, the inner products correspond to the sizes of the intersections (modulo $m$) and, in this case, MV families are more commonly referred to as families of sets with restricted modular intersections. MV families were studied previously in the context of Ramsey graphs [Gro00], circuit complexity [BBR94] and, more recently, were used to construct Locally Decodable Codes (LDCs) [Yek08, Efr09, DGY11], which are error correcting codes with super-efficient decoding properties. We will elaborate more on the connection to LDCs after we state our results.
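The defining property of an MV family can be checked mechanically. The following is a minimal sketch, not part of the paper; the function names `inner` and `is_mv_family` and the toy family in $\mathbb{Z}_5^2$ are illustrative assumptions.

```python
def inner(u, v):
    """Standard inner product over the integers."""
    return sum(a * b for a, b in zip(u, v))


def is_mv_family(U, V, m):
    """Return True iff <u_i, v_j> = 0 (mod m) exactly when i == j."""
    if len(U) != len(V):
        return False
    t = len(U)
    for i in range(t):
        for j in range(t):
            is_zero = inner(U[i], V[j]) % m == 0
            if is_zero != (i == j):
                return False
    return True


# Toy example (an assumption for illustration): u_a = (1, a), v_b = (b, m - 1),
# so <u_a, v_b> = b - a (mod m), which vanishes exactly when a == b.
m = 5
U = [(1, a) for a in range(m)]
V = [(b, m - 1) for b in range(m)]
print(is_mv_family(U, V, m))
```

The toy family has size $m$ in dimension 2, far below the bounds discussed in this paper; it only illustrates the definition.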

We denote by $\mathrm{MV}(m, n)$ the size of the largest MV family in $\mathbb{Z}_m^n$ (the size of the family is $t$ in the above notation). It is an interesting (and mostly open) question to determine the value (or even order of magnitude) of $\mathrm{MV}(m, n)$ for arbitrary $m$ and $n$. Upper and lower bounds on $\mathrm{MV}(m,n)$ can be roughly divided into two kinds, corresponding to the relative size of the two parameters. One typical regime is when $m$ is small and $n$ tends to infinity and the other is when $m \gg n$ (of course there are intermediate scenarios as well).

Although our work focuses on the regime when $m$ is much larger than $n$, we first describe the known results for the other regime, namely when $m$ is a fixed constant and $n$ tends to infinity. This regime is further divided into the case when $m$ is prime and the case when $m$ is composite. When $m$ is a small prime and $n$ tends to infinity, the value of $\mathrm{MV}(m, n)$ is known to be of the order of $n^{m-1}$ [BF98]. When $m$ is a small composite, the picture is very different and there are exponential gaps between known lower and upper bounds on $\mathrm{MV}(m, n)$. A surprising construction by Grolmusz [Gro00] shows that $\mathrm{MV}(m, n) \ge \exp\left(c \cdot \frac{(\log n)^r}{(\log\log n)^{r-1}}\right)$ when $m$ has $r$ distinct prime factors (here $c$ is an absolute constant). That is, $\mathrm{MV}(m, n)$ can be super-polynomial in $n$ (that is, $n^{\omega(1)}$) for $m$ as small as 6 (compared with the polynomial upper bound $n^{m-1}$ for prime $m$). A trivial upper bound on $\mathrm{MV}(m, n)$ is $m^n$ since an MV family cannot contain the same vector twice. The best upper bound on $\mathrm{MV}(m,n)$ for small composite $m$ was proved in [BDL13] and is $m^{n/2+O(\log m)}$. Assuming the Polynomial-Freiman-Ruzsa (PFR) conjecture [TV07] this can be improved to $\mathrm{MV}(m, n) \le C_m^{n/\log n}$, with $C_m$ a constant depending only on $m$.

Our work focuses on the regime when $m$ is larger than $n$. In this setting, a construction of [YGK12] gives MV families of size $\left(\frac{m+1}{n-2}\right)^{n/2-1}$. For a large prime $m$, this construction almost matches an upper bound of $O(m^{n/2})$ proved in [DGY11]. For composite $m$, the best upper bound on $\mathrm{MV}(m, n)$ for large $m$ prior to this work was the same $m^{n/2+O(\log m)}$ bound from [BDL13]. Notice that, when $m > 2^n$, this bound is meaningless since it exceeds the trivial bound of $m^n$.

    m                         upper bound for MV(m, n)
    ------------------------  ------------------------------------------
    general prime             $O(m^{n/2})$ [DGY11]
    small, fixed prime        $O(n^{m-1})$ [BF98]
    general composite         $m^{n/2+O(\log m)}$ [BDL13]
    small, fixed composite    $2^{O_m(n/\log n)}$ [BDL13] (assuming PFR)
    general composite         $O(m^{n/2+8.47})$ (Theorem 1.1)

Table 1: List of upper bounds on MV(m, n)

In this work we extend the proof method developed in [BDL13] to give the following bound:

Theorem 1.1. For all integers $m > 1$, $n$ we have $\mathrm{MV}(m,n) \le 100\, m^{n/2+8.47}$. When $m$ is a product of distinct primes the constant 8.47 can be replaced with $4 + o(1)$.



For small $n$, this bound is tight up to the constant 8.47 as the [YGK12] construction shows. When $m$ is small, this still gives some improvement over the $m^{n/2+O(\log m)}$ bound of [BDL13] but not as dramatic (and probably far from being tight).



The main tool in our proof is Fourier analysis in the spirit of [BDL13], with which we repeatedly reduce $m$ to one of its factors (eventually reaching the case of $m = 1$). The distribution of $\langle u_i, v_j \rangle$ over random $i, j \in [t]$ is far from the uniform distribution (since the probability of obtaining zero is small). This fact is used to find a large coefficient in its Fourier spectrum. This coefficient is then used to carve out a large subfamily which is again an MV family, but over some proper factor of $m$. The proof ends when we reach the case of prime $m$. The difference between our proof and the one in [BDL13] is in the choice of the large coefficient (or character). We are able to show that a large character appears that has nicer number theoretic properties, and so we are able to analyze the loss in each step in a better way, getting rid of the $O(\log m)$ factor in the exponent.



1.1 MV families and Locally Decodable Codes 

A $(q, \delta, \epsilon)$-Locally Decodable Code, or LDC, encodes a $K$-symbol message $x$ to an $N$-symbol codeword $C(x)$, such that every symbol $x_i$ ($i \in [K]$) can be recovered with probability at least $1 - \epsilon$ by a randomized decoding procedure that makes only $q$ queries to $C(x)$, even if $\delta N$ locations of the codeword $C(x)$ have been corrupted. Understanding the minimum length $N = N(K)$ of an LDC with constant $q$ is a central research question that is still far from being solved. For $q = 1, 2$, this question is completely answered. There are no LDCs for $q = 1$ [KT00] and the best LDCs for $q = 2$ have exponential length [GKST02, KdW04]. However, for $q > 2$ there are huge gaps between lower bounds and LDC constructions. The best known lower bound is $N = \tilde{\Omega}(K^{1+1/(\lceil q/2 \rceil - 1)})$ for $q > 4$ [Woo07] and $N = \Omega(K^2)$ for $q = 3, 4$ [KdW04, Woo10], while the best construction has super-polynomial length. Constructions of LDCs have been studied extensively for more than a decade. Until recently, all constructions of LDCs with constant $q$ had exponential encoding length. In a breakthrough work of Yekhanin [Yek08] and following improvements [Efr09, Rag07, KY09, IS10, CFL+10, DGY11, BET10], a new family of LDCs based on Matching-Vector families was introduced. These codes, called Matching-Vector codes, rely on constructions of MV families and can have sub-exponential length for $q$ as small as 3 [Efr09]. Using Grolmusz's construction as a building block, one obtains an encoding length of roughly

$$N \sim \exp\exp\left((\log K)^{O(\log\log q / \log q)} (\log\log K)\right).$$



The size of the MV family used in the code construction is critical. In its simplest form, an MV code using an MV family of size $t$ in $\mathbb{Z}_m^n$ will send $K = t$ bits of message into $N = m^n$ bits of encoding and will require $q = m$ queries to decode. Several improvements are possible for reducing the number of queries below $m$ but these are case-based and hard to generalize for arbitrary $m$.

Our improved bound on the size of MV families allows us to prove an unconditional lower bound 
on the encoding length of MV codes, regardless of the query complexity. 
Theorem 1.2. For any MV-code with message length $K$ and codeword length $N$ we have $N \ge K^{19/18}$. This bound holds regardless of the number of queries.

This theorem improves on a bound of $N \ge K 2^{\Omega(\sqrt{\log K})}$ proved in [DGY11].

1.2 Organization 

We begin in Section 2 with a number of preliminary lemmas and notations that will be used throughout the proof. In Section 3 we prove our main technical lemma which is the heart of our proof. The lemma is used iteratively in the proof of our main theorem which is given in Section 4. The proof of the stronger bound for the case when $m$ is a product of distinct primes is given in the appendix.

2 Preliminaries 
2.1 Fourier lemma 



We consider a probability distribution $\mu$ over $\mathbb{Z}_m$. Let $\omega_m = e^{\frac{2\pi}{m} i}$ be an order $m$ primitive root of unity. It is not difficult to see $\mathbb{E}_{x \sim \mu}[\omega_m^{jx}] = 0$ for all $j \in \{1, 2, \ldots, m-1\}$ if $\mu$ is the uniform distribution. We will show that $\mathbb{E}_{x \sim \mu}[\omega_m^{jx}]$ is bounded away from zero for some $j \in \{1, 2, \ldots, m-1\}$ if $\mu$ is far from being uniform.

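In [BDL13], a lower bound on

$$\max_{1 \le j \le m-1} \left| \mathbb{E}_{x \sim \mu}[\omega_m^{jx}] \right|$$

was shown when the statistical distance between $\mu$ and the uniform distribution is big, i.e. $\frac{1}{2} \sum_{x \in \mathbb{Z}_m} \left| \mu(x) - \frac{1}{m} \right| = \Omega\left(\frac{1}{m}\right)$. In the following lemma, we prove a better lower bound that depends only on $s = \mathrm{order}(\omega_m^j)$ under a stronger condition $\left| \mu(0) - \frac{1}{m} \right| = \Omega\left(\frac{1}{m}\right)$.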

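Consider $\mu$ as a function from $\mathbb{Z}_m$ to $\mathbb{C}$. For $0 \le j \le m-1$, the Fourier coefficient $\hat{\mu}(j)$ is

$$\hat{\mu}(j) = \frac{1}{m} \sum_{x \in \mathbb{Z}_m} \mu(x)\, \omega_m^{jx} = \frac{1}{m}\, \mathbb{E}_{x \sim \mu}[\omega_m^{jx}].$$

One can see that $\hat{\mu}(0) = \frac{1}{m}$. The set of functions $\{\omega_m^{jx} \mid 0 \le j \le m-1\}$ is an orthogonal basis for all functions from $\mathbb{Z}_m$ to $\mathbb{C}$, and the function $\mu(x)$ can be written as

$$\mu(x) = \sum_{j=0}^{m-1} \hat{\mu}(j)\, \omega_m^{-jx}. \tag{1}$$
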
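Lemma 2.1. Let $\mu : \mathbb{Z}_m \to [0, 1]$ be a probability distribution over $\mathbb{Z}_m$ (i.e. $\sum_{x \in \mathbb{Z}_m} \mu(x) = 1$). If $\mu(0) \le \frac{1}{100m}$, there must exist $j \in \{1, 2, \ldots, m-1\}$ such that

$$\left| \mathbb{E}_{x \sim \mu}[\omega_m^{jx}] \right| \ge \frac{1}{s f(s)}, \quad \text{where } s = \frac{m}{\gcd(j, m)}$$

is the order of $\omega_m^j$ for $\omega_m = e^{\frac{2\pi}{m} i}$, and $f : \mathbb{Z}^+ \to \mathbb{R}$ is any function satisfying $\sum_{s=2}^{\infty} \frac{1}{f(s)} \le 0.99$.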



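Proof. By setting $x = 0$ in (1), we have

$$\mu(0) = \sum_{j=0}^{m-1} \hat{\mu}(j)\, \omega_m^{0} = \frac{1}{m} + \frac{1}{m} \sum_{j=1}^{m-1} \mathbb{E}_{x \sim \mu}[\omega_m^{jx}].$$

Therefore

$$\sum_{j=1}^{m-1} \left| \mathbb{E}_{x \sim \mu}[\omega_m^{jx}] \right| \ge \left| \sum_{j=1}^{m-1} \mathbb{E}_{x \sim \mu}[\omega_m^{jx}] \right| = m \left| \mu(0) - \frac{1}{m} \right| \ge 0.99. \tag{2}$$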



For every $d \mid m$ ($1 \le d \le m-1$), define $T_d = \{ j \mid \gcd(j, m) = d,\ 1 \le j \le m-1 \}$. For all $j \in T_d$, the order of $\omega_m^j$ is $s_d = \frac{m}{d}$ ($2 \le s_d \le m$). We also see $T_d = \{ k \cdot d \mid 1 \le k < s_d,\ \gcd(k, s_d) = 1 \}$, hence $|T_d| = \varphi(s_d) \le s_d$.

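If the lemma were not true, we would have

$$\sum_{j=1}^{m-1} \left| \mathbb{E}_{x \sim \mu}[\omega_m^{jx}] \right| = \sum_{d \mid m,\ d < m}\ \sum_{j \in T_d} \left| \mathbb{E}_{x \sim \mu}[\omega_m^{jx}] \right| < \sum_{d \mid m,\ d < m} s_d \cdot \frac{1}{s_d f(s_d)} \le \sum_{s=2}^{\infty} \frac{1}{f(s)} \le 0.99.$$

This violates inequality (2). Thus the lemma is proved. □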
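Lemma 2.1 can be sanity-checked numerically on small distributions. The sketch below is an illustration only: the helper names `character_bias` and `find_biased_character` are ours, and the choice $f(s) = s^{1.735}$ is borrowed from the proof of Theorem 1.1 later in the paper.

```python
import cmath
from math import gcd


def character_bias(mu, freq):
    """|E_{x~mu}[omega_m^(freq*x)]| for a distribution mu given as a list over Z_m."""
    m = len(mu)
    total = sum(p * cmath.exp(2j * cmath.pi * freq * x / m)
                for x, p in enumerate(mu))
    return abs(total)


def find_biased_character(mu, f):
    """Return (freq, s, bias) with bias >= 1/(s*f(s)), or None if no such freq exists."""
    m = len(mu)
    for freq in range(1, m):
        s = m // gcd(freq, m)          # order of omega_m^freq
        bias = character_bias(mu, freq)
        if bias >= 1.0 / (s * f(s)):
            return freq, s, bias
    return None


def f(s):
    # Assumed choice of f, taken from the proof of Theorem 1.1.
    return s ** 1.735


# mu(0) = 0 <= 1/(100 m): uniform on the nonzero residues of Z_6.
mu = [0.0] + [1 / 5] * 5
print(find_biased_character(mu, f))
```

For the uniform distribution on $\mathbb{Z}_6$ the same search returns `None`, which is consistent with the lemma because that distribution violates the hypothesis $\mu(0) \le \frac{1}{100m}$.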



2.2 Notations and Facts about MV Families 



We use $\langle \cdot, \cdot \rangle$ to denote the inner product over $\mathbb{Z}$ between two vectors. In all calculations, we identify $\mathbb{Z}_m$ with $\{0, 1, \ldots, m-1\}$ and treat the numbers as elements of $\mathbb{Z}$. Conventionally, we consider $a \bmod 1$ to be 0 for any integer $a$.

Notation 2.2. Let $r$ be a positive integer. For an integer $v$, define $v^{(r)} \in \{0, 1, \ldots, r-1\}$ to be $v$ modulo $r$. For a vector $v = (v_1, v_2, \ldots, v_n)$, define $v^{(r)} = (v_1^{(r)}, v_2^{(r)}, \ldots, v_n^{(r)})$. For a list of vectors $V = (v_1, v_2, \ldots, v_t)$, define $V^{(r)} = (v_1^{(r)}, v_2^{(r)}, \ldots, v_t^{(r)})$.

Notation 2.3. Let $r$ be a positive integer. For an integer $v$, define $v^{[r]} \in \mathbb{Z}$ to be $(v - v^{(r)})/r$. For a vector $v = (v_1, v_2, \ldots, v_n)$, define $v^{[r]} = (v_1^{[r]}, v_2^{[r]}, \ldots, v_n^{[r]})$. Thus $v = r v^{[r]} + v^{(r)}$ for any vector $v$. For a list of vectors $V = (v_1, v_2, \ldots, v_t)$, define $V^{[r]} = (v_1^{[r]}, v_2^{[r]}, \ldots, v_t^{[r]})$.
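The two notations above amount to coordinate-wise remainder and floor quotient. As a minimal sketch (the names `low_part` and `high_part` are ours, not the paper's), the identity $v = r v^{[r]} + v^{(r)}$ can be verified directly:

```python
def low_part(v, r):
    """v^(r): coordinate-wise residue in {0, ..., r-1}."""
    return tuple(x % r for x in v)


def high_part(v, r):
    """v^[r]: coordinate-wise (x - x mod r) / r, i.e. the floor quotient."""
    return tuple(x // r for x in v)


v, r = (7, -3, 10), 4
lo, hi = low_part(v, r), high_part(v, r)
# Reassemble v = r * v^[r] + v^(r) coordinate by coordinate.
rebuilt = tuple(r * h + l for h, l in zip(hi, lo))
print(lo, hi, rebuilt == v)
```

Python's `%` and `//` already use the nonnegative-remainder convention for a positive modulus, so negative coordinates are handled consistently with Notation 2.2.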



Definition 2.4. Let $U = (u_1, u_2, \ldots, u_t)$ and $V = (v_1, v_2, \ldots, v_t)$ be two lists of vectors in $\mathbb{Z}_m^n$. $(U, V)$ is a matching vector family if $\langle u_i, v_i \rangle \equiv 0 \pmod{m}$ for all $i \in [t]$ and $\langle u_i, v_j \rangle \not\equiv 0 \pmod{m}$ for all $i \neq j \in [t]$. The number $t$ is the size of the MV family and is denoted by $|(U, V)|$.

Claim 2.5. For an MV family $(U, V)$ where $U = (u_1, u_2, \ldots, u_t)$, $V = (v_1, v_2, \ldots, v_t)$ and $i \neq j \in [t]$, we have $u_i \neq u_j$ and $v_i \neq v_j$.

Proof. Assume $u_i = u_j$ for some $i \neq j$. Then $\langle u_i, v_j \rangle = \langle u_j, v_j \rangle \equiv 0 \pmod{m}$. This violates the definition of MV family. The case $v_i = v_j$ is symmetric. □

Notation 2.6. Let $U, V, U', V'$ be 4 lists of vectors in $\mathbb{Z}_m^n$, and say $U = (u_1, u_2, \ldots, u_t)$, $V = (v_1, v_2, \ldots, v_t)$. We write $(U', V') \subseteq (U, V)$ if there exists a set $T \subseteq [t]$ such that $U' = (u_i : i \in T)$ and $V' = (v_i : i \in T)$. Observe that if $(U, V)$ is an MV family, so is $(U', V')$.

Definition 2.7. $(r_1, r_2, r_3)$ is a partition of $m$ if $r_1, r_2, r_3 \in \mathbb{Z}^+$ and $r_1 r_2 r_3 = m$. ($r_1, r_2, r_3$ are not assumed to be coprime.)

Definition 2.8. For an MV family $(U, V)$ where $U = (u_1, u_2, \ldots, u_t)$ and $V = (v_1, v_2, \ldots, v_t)$, we say $(U, V)$ respects $(r_1, r_2, r_3)$, where $(r_1, r_2, r_3)$ is a partition of $m$, if the following conditions are satisfied:

1. $\exists u_0 \in \mathbb{Z}_{r_1}^n$ such that $u_i^{(r_1)} = u_0$ for all $i \in [t]$,

2. $\exists v_0 \in \mathbb{Z}_{r_2}^n$ such that $v_i^{(r_2)} = v_0$ for all $i \in [t]$,

3. $\langle u_i^{[r_1]}, v_0 \rangle$ modulo $r_2$ is the same for all $i \in [t]$,

4. $\langle u_0, v_i^{[r_2]} \rangle$ modulo $r_1$ is the same for all $i \in [t]$.
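The four conditions of Definition 2.8 translate directly into a small checker. This is a sketch under our own naming (`respects`, `low`, `high`, `inner` are not from the paper); it tests only the four conditions and does not verify that $(U, V)$ is an MV family or that $r_1 r_2 r_3 = m$.

```python
def inner(u, v):
    return sum(a * b for a, b in zip(u, v))


def low(v, r):
    """v^(r): coordinate-wise residue modulo r."""
    return tuple(x % r for x in v)


def high(v, r):
    """v^[r]: coordinate-wise floor quotient by r."""
    return tuple(x // r for x in v)


def respects(U, V, r1, r2):
    """Check conditions 1-4 of Definition 2.8 for the partition (r1, r2, m/(r1*r2))."""
    lows_u = {low(u, r1) for u in U}          # condition 1: a single u_0
    lows_v = {low(v, r2) for v in V}          # condition 2: a single v_0
    if len(lows_u) != 1 or len(lows_v) != 1:
        return False
    u0 = next(iter(lows_u))
    v0 = next(iter(lows_v))
    cond3 = {inner(high(u, r1), v0) % r2 for u in U}
    cond4 = {inner(u0, high(v, r2)) % r1 for v in V}
    return len(cond3) == 1 and len(cond4) == 1


m = 5
U = [(1, a) for a in range(m)]
V = [(b, m - 1) for b in range(m)]
print(respects(U, V, 1, 1))   # the trivial partition (1, 1, m)
print(respects(U, V, 5, 1))   # fails: the u_i differ modulo r1 = 5
```

With $r_1 = r_2 = 1$ every residue is zero, so all four conditions hold automatically; this is the content of Claim 2.10.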

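Claim 2.9. If an MV family $(U, V)$ respects $(r_1, r_2, r_3)$, then $\langle u_i, v_j \rangle \equiv 0 \pmod{r_1 r_2}$ for all $u_i \in U$, $v_j \in V$.

Proof. Let $u_0 = u_i^{(r_1)}$ and $v_0 = v_j^{(r_2)}$. They are fixed for all $u_i \in U$ and $v_j \in V$. We have

$$\langle u_i, v_j \rangle = \langle r_1 u_i^{[r_1]} + u_0,\ r_2 v_j^{[r_2]} + v_0 \rangle = r_1 r_2 \langle u_i^{[r_1]}, v_j^{[r_2]} \rangle + r_1 \langle u_i^{[r_1]}, v_0 \rangle + r_2 \langle u_0, v_j^{[r_2]} \rangle + \langle u_0, v_0 \rangle.$$

The first term is 0 modulo $r_1 r_2$. The second term is fixed modulo $r_1 r_2$ because $\langle u_i^{[r_1]}, v_0 \rangle$ is fixed modulo $r_2$. Similarly, the third term is also a constant modulo $r_1 r_2$. Therefore $\langle u_i, v_j \rangle$ modulo $r_1 r_2$ is the same for all $u_i \in U$ and $v_j \in V$. Note that when $i = j$, $\langle u_i, v_j \rangle \equiv 0 \pmod{r_1 r_2}$ since $(U, V)$ is an MV family. Therefore $\langle u_i, v_j \rangle \equiv 0 \pmod{r_1 r_2}$ for all $u_i \in U$, $v_j \in V$. □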

Claim 2.10. Every MV family (U,V) respects (1,1, m). 

Proof. Let $u_0$ and $v_0$ be the zero vector. All the conditions are satisfied. □

Claim 2.11. If an MV family $(U, V)$ respects $(r_1, r_2, 1)$, then it must have size 1.

Proof. Since $r_1 r_2 = m$, by Claim 2.9 we have $\langle u_i, v_j \rangle \equiv 0 \pmod{m}$ for all $u_i \in U$, $v_j \in V$. By the definition of MV family, the size of $(U, V)$ must be 1. □

3 Proof of the Main Lemma 

Consider an MV family $(U, V)$, where $U = (u_1, u_2, \ldots, u_t)$ and $V = (v_1, v_2, \ldots, v_t)$. We pick $u \in U$ and $v \in V$ uniformly at random and consider the distribution of $\langle u, v \rangle^{(m)}$. The inner product is 0 with probability $1/t$. Thus the distribution is far from uniform when $t \gg m$. We will take advantage of this fact and prove our key lemma. For an MV family $(U, V)$ respecting $(r_1, r_2, r_3)$, we can find a large subfamily and reduce $r_3$ to some smaller number.

Let $f : \mathbb{Z}^+ \to \mathbb{R}$ be a function satisfying $\sum_{s=2}^{\infty} \frac{1}{f(s)} \le 0.99$. We will specify $f(s)$ in later proofs.

Lemma 3.1. If an MV family $(U, V)$ respects $(r_1, r_2, r_3)$ for some $r_3 \ge 2$ and $|(U, V)| = t \ge 100m$, then there exists $s \mid r_3$ with $s \ge 2$ and an MV family $(U', V') \subseteq (U, V)$ with $|(U', V')| \ge t / (s^{n/2+4} f(s)^2)$ that respects either $(r_1 s, r_2, r_3/s)$ or $(r_1, r_2 s, r_3/s)$.

Proof. We prove the lemma in 4 steps. 



Step 1: Finding a nice character with a large bias. 



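By Claim 2.9, $\frac{\langle u, v \rangle}{r_1 r_2}$ is an integer for all $u \in U$, $v \in V$. We can also see $\frac{\langle u, v \rangle}{r_1 r_2} \equiv 0 \pmod{r_3}$ iff $\langle u, v \rangle \equiv 0 \pmod{m}$. Consider the distribution of $\left( \frac{\langle u, v \rangle}{r_1 r_2} \right)^{(r_3)} \in \mathbb{Z}_{r_3}$, where $u$ and $v$ are uniformly drawn from $U$ and $V$ respectively. We have

$$\Pr\left[ \left( \frac{\langle u, v \rangle}{r_1 r_2} \right)^{(r_3)} = 0 \right] = \Pr\left[ \langle u, v \rangle \equiv 0 \pmod{m} \right] = \frac{1}{t} \le \frac{1}{100m} \le \frac{1}{100 r_3}.$$

Applying Lemma 2.1 on $\mathbb{Z}_{r_3}$, there exists a $j \in \{1, 2, \ldots, r_3 - 1\}$ such that

$$\left| \mathbb{E}_{u \sim U,\ v \sim V}\left[ \omega_{r_3}^{j \langle u, v \rangle / (r_1 r_2)} \right] \right| \ge \frac{1}{s f(s)}, \tag{3}$$

where $\omega_{r_3} = e^{\frac{2\pi}{r_3} i}$ and $s = \frac{r_3}{\gcd(j, r_3)}$ is the order of $\omega_{r_3}^j$. Note that we have dropped the modulo $r_3$ operation because $(\omega_{r_3})^{r_3} = 1$. It follows that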



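$$\mathbb{E}_{u, \tilde{u} \sim U,\ v \sim V}\left[ \omega_{r_3}^{j \langle u - \tilde{u}, v \rangle / (r_1 r_2)} \right] = \mathbb{E}_{v \sim V}\left[ \left| \mathbb{E}_{u \sim U}\left[ \omega_{r_3}^{j \langle u, v \rangle / (r_1 r_2)} \right] \right|^2 \right] \ge \left| \mathbb{E}_{u \sim U,\ v \sim V}\left[ \omega_{r_3}^{j \langle u, v \rangle / (r_1 r_2)} \right] \right|^2 \ge \frac{1}{s^2 f(s)^2}.$$

Therefore there exists a fixed $\tilde{u} \in U$ such that

$$\left| \mathbb{E}_{u \sim U,\ v \sim V}\left[ \omega_{r_3}^{j \langle u - \tilde{u}, v \rangle / (r_1 r_2)} \right] \right| \ge \frac{1}{s^2 f(s)^2}.$$

Since $u^{(r_1)} = \tilde{u}^{(r_1)}$, we have $u - \tilde{u} = r_1 (u^{[r_1]} - \tilde{u}^{[r_1]})$. The above inequality can be written as

$$\left| \mathbb{E}_{u \sim U,\ v \sim V}\left[ \omega_{r_3}^{j \langle u^{[r_1]} - \tilde{u}^{[r_1]}, v \rangle / r_2} \right] \right| \ge \frac{1}{s^2 f(s)^2}. \tag{4}$$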



Step 2: Partitioning into buckets. 



We partition the set $U$ into buckets according to $u^{[r_1]} - \tilde{u}^{[r_1]}$ modulo $s$: $U = \bigcup_{w \in \mathbb{Z}_s^n} B(w, U)$, where

$$B(w, U) = \left\{ u \in U \ \middle|\ (u^{[r_1]} - \tilde{u}^{[r_1]})^{(s)} = w \right\}.$$

We also partition $V$ into buckets $B(w, V) = \{ v \in V \mid (v^{[r_2]})^{(s)} = w \}$ for all $w \in \mathbb{Z}_s^n$. Define $p_w = |B(w, U)|/t$ to be the density of $B(w, U)$ and $q_w = |B(w, V)|/t$ to be the density of $B(w, V)$.

Picking $u$ uniformly from $U$ can be equivalently considered as two steps: 1. For each bucket $B(w, U)$, pick a representative $u_w \in B(w, U)$ uniformly; 2. Pick one bucket according to the probability distribution $p_w$, and output the representative. For inequality (4), we split the procedure of picking $u \sim U$ into these two steps.



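$$\frac{1}{s^2 f(s)^2} \le \left| \mathbb{E}_{u \sim U,\ v \sim V}\left[ \omega_{r_3}^{j \langle u^{[r_1]} - \tilde{u}^{[r_1]}, v \rangle / r_2} \right] \right| = \left| \mathbb{E}_{\text{for each } w,\ u_w \sim B(w, U)}\ \mathbb{E}_{w \sim p_w}\ \mathbb{E}_{v \sim V}\left[ \omega_{r_3}^{j \langle u_w^{[r_1]} - \tilde{u}^{[r_1]}, v \rangle / r_2} \right] \right|$$

$$\le \mathbb{E}_{\text{for each } w,\ u_w \sim B(w, U)} \left| \mathbb{E}_{w \sim p_w}\ \mathbb{E}_{v \sim V}\left[ \omega_{r_3}^{j \langle u_w^{[r_1]} - \tilde{u}^{[r_1]}, v \rangle / r_2} \right] \right|.$$

There exists a fixed list of representatives from each bucket $(u_w \in B(w, U) : w \in \mathbb{Z}_s^n)$ such that

$$\frac{1}{s^2 f(s)^2} \le \left| \mathbb{E}_{w \sim p_w,\ v \sim V}\left[ \omega_{r_3}^{j \langle u_w^{[r_1]} - \tilde{u}^{[r_1]}, v \rangle / r_2} \right] \right|. \tag{5}$$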



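For every $w \in \mathbb{Z}_s^n$ and $u \in B(w, U)$, we use $u'$ to denote the vector $(u^{[r_1]} - \tilde{u}^{[r_1]})^{[s]}$. Thus

$$u_w^{[r_1]} - \tilde{u}^{[r_1]} = s\, (u_w^{[r_1]} - \tilde{u}^{[r_1]})^{[s]} + (u_w^{[r_1]} - \tilde{u}^{[r_1]})^{(s)} = s u'_w + w.$$

Hence inequality (5) can be written as

$$\frac{1}{s^2 f(s)^2} \le \left| \mathbb{E}_{w \sim p_w,\ v \sim V}\left[ \omega_{r_3}^{j \langle s u'_w + w, v \rangle / r_2} \right] \right|. \tag{6}$$
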
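Step 3: Finding a large bucket.

By inequality (6),

$$\frac{1}{s^2 f(s)^2} \le \left| \sum_{w \in \mathbb{Z}_s^n} p_w \cdot \mathbb{E}_{v \sim V}\left[ \omega_{r_3}^{j \langle s u'_w + w, v \rangle / r_2} \right] \right| \le \sqrt{\sum_{w \in \mathbb{Z}_s^n} p_w^2} \cdot \sqrt{\sum_{w \in \mathbb{Z}_s^n} \left| \mathbb{E}_{v \sim V}\left[ \omega_{r_3}^{j \langle s u'_w + w, v \rangle / r_2} \right] \right|^2}$$

$$= \sqrt{\sum_{w \in \mathbb{Z}_s^n} p_w^2} \cdot \sqrt{\sum_{v, \hat{v} \in V}\ \sum_{w \in \mathbb{Z}_s^n} \frac{1}{t^2}\, \omega_{r_3}^{j \langle s u'_w + w, v - \hat{v} \rangle / r_2}} = \sqrt{\sum_{w \in \mathbb{Z}_s^n} p_w^2} \cdot \sqrt{s^n \sum_{w \in \mathbb{Z}_s^n} q_w^2}. \tag{7}$$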

In the last step we used the fact $v^{[r_2]} \neq \hat{v}^{[r_2]}$ for two $v \neq \hat{v} \in V$. This can be seen by contrapositive. If $v^{[r_2]} = \hat{v}^{[r_2]}$, we have $v = \hat{v}$ since $v^{(r_2)} = \hat{v}^{(r_2)}$. This contradicts Claim 2.5.
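The last equality in (7) rests on the orthogonality of characters. The following short derivation is our reading of that step, not text from the paper; the symbol $\zeta = \omega_{r_3}^j$ and the generic vector $a$ are introduced here only for convenience.

```latex
% zeta := omega_{r_3}^j is a primitive s-th root of unity (notation introduced here).
% Since v - \hat v = r_2 (v^{[r_2]} - \hat v^{[r_2]}) and zeta^s = 1,
\[
\omega_{r_3}^{\,j\langle s u'_w + w,\; v-\hat v\rangle / r_2}
  = \zeta^{\langle w,\; v^{[r_2]} - \hat v^{[r_2]}\rangle},
\qquad
\sum_{w\in\mathbb{Z}_s^n} \zeta^{\langle w, a\rangle}
  = \begin{cases} s^n & \text{if } a \equiv 0 \pmod{s},\\ 0 & \text{otherwise.}\end{cases}
\]
% Hence only pairs (v, \hat v) lying in the same bucket B(w, V) survive:
\[
\sum_{w\in\mathbb{Z}_s^n} \frac{1}{t^2}\sum_{v,\hat v\in V}
  \omega_{r_3}^{\,j\langle s u'_w + w,\; v-\hat v\rangle / r_2}
  = \frac{s^n}{t^2}\sum_{w\in\mathbb{Z}_s^n} |B(w,V)|^2
  = s^n \sum_{w\in\mathbb{Z}_s^n} q_w^2 .
\]
```

The term $s u'_w$ drops out because $\zeta^s = 1$, which is why the dependence of $u'_w$ on $w$ does not interfere with the sum over $w$.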



By (7), we see either $\sum_w p_w^2 \ge 1/(s^{n/2+2} f(s)^2)$ or $\sum_w q_w^2 \ge 1/(s^{n/2+2} f(s)^2)$. Without loss of generality, assume $\sum_w p_w^2 \ge 1/(s^{n/2+2} f(s)^2)$. By

$$\max_w \{p_w\} = \max_w \{p_w\} \cdot \sum_w p_w \ge \sum_w p_w^2,$$

there exists a bucket $B(w_0, U)$ with size at least $t/(s^{n/2+2} f(s)^2)$. Let $\widetilde{U}$ be that bucket, and $\widetilde{V}$ be the subset of $V$ with the same indices. Then $(\widetilde{U}, \widetilde{V}) \subseteq (U, V)$ is an MV family of size at least $t/(s^{n/2+2} f(s)^2)$. Next, we will find a subfamily $(U', V') \subseteq (\widetilde{U}, \widetilde{V})$ that respects $(r_1 s, r_2, r_3/s)$.
Step 4: Analyzing the elements in the large bucket.

Let $u_0$ and $v_0$ denote $u^{(r_1)}$ and $v^{(r_2)}$ respectively for $u \in U$, $v \in V$. For every $u \in \widetilde{U}$, we know $(u^{[r_1]} - \tilde{u}^{[r_1]})^{(s)}$ equals the same vector $w_0$ by the definition of the bucket. Therefore $u^{[r_1]} - \tilde{u}^{[r_1]} = s u' + w_0$ and

$$u = r_1 u^{[r_1]} + u_0 = r_1 (\tilde{u}^{[r_1]} + s u' + w_0) + u_0 = r_1 s u' + (r_1 \tilde{u}^{[r_1]} + r_1 w_0 + u_0). \tag{8}$$

We can see $u^{(r_1 s)} = r_1 \tilde{u}^{[r_1]} + r_1 w_0 + u_0$ is the same for all $u \in \widetilde{U}$. Also $v^{(r_2)} = v_0$ is the same for all $v \in \widetilde{V}$. These two conditions are still satisfied for any subfamily of $(\widetilde{U}, \widetilde{V})$. It suffices to find $(U', V') \subseteq (\widetilde{U}, \widetilde{V})$ such that

• $\langle u^{[r_1 s]}, v_0 \rangle$ modulo $r_2$ is the same for all $u \in U'$. By (8) we have $u^{[r_1 s]} = u'$, so we need $\langle u', v_0 \rangle$ modulo $r_2$ to be the same for all $u \in U'$.

• $\langle r_1 \tilde{u}^{[r_1]} + r_1 w_0 + u_0, v^{[r_2]} \rangle$ modulo $r_1 s$ is the same for all $v \in V'$.

Since $\langle u^{[r_1]}, v_0 \rangle = \langle s u' + \tilde{u}^{[r_1]} + w_0, v_0 \rangle$ modulo $r_2$ is the same for all $u \in \widetilde{U}$ by $(U, V)$ respecting $(r_1, r_2, r_3)$, we can see that $s \langle u', v_0 \rangle$ modulo $r_2$ is the same for all $u \in \widetilde{U}$. Hence there are $\gcd(s, r_2)$ possible values for $\langle u', v_0 \rangle$ modulo $r_2$. We pick the most frequent value $c_1$ and keep only the vectors with $\langle u', v_0 \rangle \equiv c_1 \pmod{r_2}$ in $\widetilde{U}$ and the corresponding vectors in $\widetilde{V}$.

Since $\langle u_0, v^{[r_2]} \rangle$ modulo $r_1$ is the same for all $v \in \widetilde{V}$ by $(U, V)$ respecting $(r_1, r_2, r_3)$, we can see that there are $s$ possible values for $\langle u_0, v^{[r_2]} \rangle$ modulo $s r_1$. We pick the most frequent value $c_2$ and keep only the vectors with $\langle u_0, v^{[r_2]} \rangle \equiv c_2 \pmod{s r_1}$ in $\widetilde{V}$ and the corresponding vectors in $\widetilde{U}$.

After the above two steps, the MV family has size at least

$$\frac{|(\widetilde{U}, \widetilde{V})|}{\gcd(s, r_2) \cdot s} \ge \frac{|(\widetilde{U}, \widetilde{V})|}{s^2} \ge \frac{t}{s^{n/2+4} f(s)^2}.$$

And this is the required $(U', V')$. □
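Both filtering steps in Step 4 are instances of the same pigeonhole move: group the index-aligned pairs by a residue and keep the largest group. A minimal sketch follows; the function name `keep_most_frequent` and the toy data are illustrative assumptions, not the paper's notation.

```python
from collections import Counter


def keep_most_frequent(pairs, key):
    """Keep the (u, v) pairs whose key value is the most frequent one.

    pairs: list of index-aligned (u, v) tuples.
    key:   function (u, v) -> hashable residue, e.g. <u', v_0> mod r_2.
    If the key takes at most c distinct values, at least len(pairs)/c pairs survive.
    """
    counts = Counter(key(u, v) for u, v in pairs)
    best_value, _ = counts.most_common(1)[0]
    return [(u, v) for u, v in pairs if key(u, v) == best_value]


# Toy data: the key is the parity of the first coordinate of u (2 possible values).
pairs = [((1,), (0,)), ((2,), (0,)), ((3,), (0,)), ((5,), (0,))]
kept = keep_most_frequent(pairs, lambda u, v: u[0] % 2)
print(len(kept), "of", len(pairs), "pairs kept")
```

Because $U$ and $V$ are filtered together by index, the surviving pairs still form a subfamily in the sense of Notation 2.6.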
4 Proof of Theorems 1.1 and 1.2

We now prove Theorem 1.1 by repeatedly applying Lemma 3.1.

Proof of Theorem 1.1. By Claim 2.10, $(U, V)$ respects $(1, 1, m)$. Initially we set $r_1 = 1$, $r_2 = 1$ and $r_3 = m$. By Lemma 3.1, there is a subfamily that respects $(r_1', r_2', r_3')$, where $r_1' r_2' r_3' = m$ and $r_3' < m$. We repeatedly apply Lemma 3.1. In each round $r_3$ is reduced by some factor. We can continue this procedure until either $r_3 = 1$ or the size of the MV family becomes less than $100m$. For the case $r_3 = 1$, the size of the MV family is also less than $100m$ by Claim 2.11. Say there are $k$ rounds, and in each round we divide $r_3$ by $s_1, s_2, \ldots, s_k$ respectively. We have $s_1 s_2 \cdots s_k \le m$ and in the $i$th round ($i \in [k]$), the size of the MV family is decreased by a factor of at most $s_i^{n/2+4} f(s_i)^2$. Therefore the original size is upper bounded by

$$|(U, V)| \le 100m \prod_{i=1}^{k} s_i^{n/2+4} f(s_i)^2 \le 100m \cdot m^{n/2+4} \cdot \prod_{i=1}^{k} f(s_i)^2 = 100\, m^{n/2+5} \prod_{i=1}^{k} f(s_i)^2.$$

Pick $f(s) = s^{1.735}$; we can verify $\sum_{s=2}^{\infty} \frac{1}{f(s)} < 0.99$. Therefore $|(U, V)| \le 100\, m^{n/2+5} (m^{1.735})^2 = 100\, m^{n/2+8.47}$. □
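The claim $\sum_{s \ge 2} s^{-1.735} < 0.99$ can be confirmed numerically. The sketch below sums a finite prefix and bounds the tail by an integral; the function name `power_sum_upper_bound` and the cutoff value are our own choices.

```python
def power_sum_upper_bound(exponent, cutoff=200_000):
    """Upper bound on sum_{s >= 2} s^(-exponent) for exponent > 1.

    The terms up to `cutoff` are summed exactly; the remaining tail is bounded
    by the integral of x^(-exponent) from cutoff to infinity.
    """
    partial = sum(s ** -exponent for s in range(2, cutoff + 1))
    tail = cutoff ** (1 - exponent) / (exponent - 1)
    return partial + tail


bound = power_sum_upper_bound(1.735)
print(bound, bound < 0.99)

# The exponent in Theorem 1.1: n/2 + 5 from the product of the s_i, plus 2 * 1.735.
print(5 + 2 * 1.735)
```

The margin is small (the sum is roughly 0.9886), which suggests that 1.735 was chosen close to the smallest exponent for which the condition on $f$ holds.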

Combining with the upper bound $\mathrm{MV}(m, n) \le m^{n-1+o_m(1)}$ proved in [DGY11], we can give a universal lower bound for the length of the MV code in [DGY11]. This is a restatement of Theorem 1.2 stated in the introduction.

Corollary 4.1. Any MV code (as constructed in [DGY11]) has encoding length at least $N \ge K^{19/18}$, where $K$ is the message length, regardless of the query complexity.



Proof. Given an MV family in $\mathbb{Z}_m^n$ with size $t$, we can encode a message of length $K = t$ into a codeword of length $N = m^n$.

If $n \ge 19$, by Theorem 1.1 we have $K \le 100\, m^{n/2+8.47}$. Since $n/2 + 8.47 \le (1/2 + 8.47/19)\, n < \frac{18}{19} n$, we get $K < m^{\frac{18}{19} n} = N^{\frac{18}{19}}$ for sufficiently large $m$, and hence $N > K^{\frac{19}{18}}$.

If $n \le 18$, it was shown in [DGY11] that $K \le m^{n-1+o_m(1)}$. Hence $K \le m^{n - \frac{18}{19}} \le m^{n - \frac{n}{19}} = m^{\frac{18}{19} n} = N^{\frac{18}{19}}$ and $N \ge K^{\frac{19}{18}}$. Note that here we assumed $m$ is sufficiently large. This is reasonable because we are considering encoding an arbitrarily long message and $K$ is sufficiently large. □
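The case split at $n = 19$ is exactly where the two exponent comparisons switch. The following sketch checks this with exact rational arithmetic; the function names are ours, and the constant factor 100 and the $o_m(1)$ term are ignored here, as they are absorbed by taking $m$ large.

```python
from fractions import Fraction

C = Fraction(847, 100)        # the constant 8.47 from Theorem 1.1
TARGET = Fraction(18, 19)     # we want exponent <= (18/19) * n


def theorem_exponent_fits(n):
    """Is n/2 + 8.47 <= (18/19) n ?  (used for n >= 19)"""
    return Fraction(n, 2) + C <= TARGET * n


def small_n_exponent_fits(n):
    """Is n - 18/19 <= (18/19) n ?  (used for n <= 18)"""
    return n - TARGET <= TARGET * n


for n in (17, 18, 19, 20):
    print(n, theorem_exponent_fits(n), small_n_exponent_fits(n))
```

The first comparison holds exactly for $n \ge 19$ and the second exactly for $n \le 18$, so the two cases cover every $n$ with the same target exponent $\frac{18}{19} n$.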



5 The case of distinct prime factors 

If $m$ is a product of distinct primes, the bound can be improved to $m^{n/2+4+o_m(1)}$. The proof follows the same outline as for general composite $m$.

Theorem 5.1. Let $m$ be a product of distinct primes. For every MV family $(U, V)$ in $\mathbb{Z}_m^n$, $|(U, V)| \le 100\, m^{n/2+4+o_m(1)}$, where $o_m(1)$ goes to 0 as $m$ grows.

Proof. The proof is similar to that of Theorem 1.1. We only sketch the changes here.

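First, we improve the size of $(U', V')$ found in Lemma 3.1 to $t/(s^{n/2+2} f(s))$. Since $m$ is a product of distinct primes, $r_1$ and $r_2$ must be coprime to $s$, where $s$ is the number in inequality (3). Let $\tau_1$ and $\tau_2$ be integers such that $\tau_1 r_1 \equiv 1 \pmod{s}$ and $\tau_2 r_2 \equiv 1 \pmod{s}$; we have

$$\omega_{r_3}^{j \langle u, v \rangle / (r_1 r_2)} = \omega_{r_3}^{j \langle u, v \rangle \tau_1 \tau_2}.$$

We partition $U$ and $V$ into buckets according to $u$ modulo $s$ and $v$ modulo $s$: $U = \bigcup_w B(w, U)$ and $V = \bigcup_w B(w, V)$, where

$$B(w, U) = \{ u \in U \mid u^{(s)} = w \}$$

and

$$B(w, V) = \{ v \in V \mid v^{(s)} = w \}.$$

We still use $p_w$ to denote $|B(w, U)|/t$ and $q_w$ to denote $|B(w, V)|/t$. By inequality (3),

$$\frac{1}{s f(s)} \le \left| \mathbb{E}_{u \sim U,\ v \sim V}\left[ \omega_{r_3}^{j \langle u, v \rangle \tau_1 \tau_2} \right] \right| = \left| \sum_{w \in \mathbb{Z}_s^n} p_w \cdot \frac{1}{t} \sum_{v \in V} \omega_{r_3}^{j \langle w, v \rangle \tau_1 \tau_2} \right|$$

$$\le \sqrt{\sum_{w \in \mathbb{Z}_s^n} p_w^2} \cdot \sqrt{\sum_{v, \hat{v} \in V}\ \sum_{w \in \mathbb{Z}_s^n} \frac{1}{t^2}\, \omega_{r_3}^{j \langle w, v - \hat{v} \rangle \tau_1 \tau_2}} = \sqrt{\sum_{w \in \mathbb{Z}_s^n} p_w^2} \cdot \sqrt{s^n \sum_{w \in \mathbb{Z}_s^n} q_w^2}.$$

We can see either $\sum_w p_w^2 \ge 1/(s^{n/2+1} f(s))$ or $\sum_w q_w^2 \ge 1/(s^{n/2+1} f(s))$. Without loss of generality, assume $\sum_w p_w^2 \ge 1/(s^{n/2+1} f(s))$. By

$$\max_w \{p_w\} = \max_w \{p_w\} \cdot \sum_w p_w \ge \sum_w p_w^2,$$

there exists a bucket with size $|B(w, U)| \ge t/(s^{n/2+1} f(s))$. Let $\widetilde{U}$ be that bucket, and $\widetilde{V}$ be the subset of $V$ with the same indices. Then $(\widetilde{U}, \widetilde{V}) \subseteq (U, V)$ is an MV family of size at least $t/(s^{n/2+1} f(s))$.

Next we find $(U', V') \subseteq (\widetilde{U}, \widetilde{V})$ using the same method as in Lemma 3.1. As in Step 4 of that proof,

$$|(U', V')| \ge \frac{|(\widetilde{U}, \widetilde{V})|}{\gcd(s, r_2) \cdot s} = \frac{|(\widetilde{U}, \widetilde{V})|}{s} \ge \frac{t}{s^{n/2+2} f(s)}.$$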



At last, we use the proof of Theorem 1.1 except with $f(s) = 3 s \ln^2 s$. One can verify $\sum_{s=2}^{\infty} \frac{1}{f(s)} < 0.99$. Let $s_1, s_2, \ldots, s_k$ be the numbers divided from $r_3$ in each round. By the proof of Theorem 1.1,

$$|(U, V)| \le 100m \prod_{i=1}^{k} s_i^{n/2+2} f(s_i) \le 100\, m^{n/2+3} \prod_{i=1}^{k} (3 s_i \ln^2 s_i) \le 100\, m^{n/2+4} \prod_{i=1}^{k} (3 \ln^2 s_i).$$

For a sufficiently large integer $s$, we have $3 \ln^2 s < s^{\epsilon}$, where $\epsilon$ is an arbitrary fixed small number. When $m \to \infty$, all $s_1, s_2, \ldots, s_k$ except a constant number of them must be that large. Taking $\epsilon \to 0$, we have $|(U, V)| \le m^{n/2+4+o_m(1)}$. □
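The condition on $f$ used in this section can also be checked numerically. The sketch below assumes $f(s) = 3 s \ln^2 s$, which is how we read the product in the displayed bound; the function name and cutoff are our own choices.

```python
import math


def log_sum_upper_bound(cutoff=200_000):
    """Upper bound on sum_{s >= 2} 1 / (3 s ln^2 s).

    The terms up to `cutoff` are summed exactly; the tail is bounded by the
    integral of 1/(3 x ln^2 x) from cutoff to infinity, which equals 1/(3 ln cutoff).
    """
    partial = sum(1.0 / (3 * s * math.log(s) ** 2) for s in range(2, cutoff + 1))
    tail = 1.0 / (3 * math.log(cutoff))
    return partial + tail


bound = log_sum_upper_bound()
print(bound, bound < 0.99)
```

Under this assumption the sum is about 0.70, comfortably below the required 0.99.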